# [NIPS 2024 in submission] 

<!-- This is PyTorch implementation of Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition. -->
This is PyTorch implementation of [Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition] at NIPS 2024 in submission.

## Abstract
Long-tailed semi-supervised learning is a challenging task that aims to train a model using limited labeled data that exhibit a long-tailed label distribution. State-of-the-art LTSSL methods rely on high-quality pseudo-labels assigned for large-scale unlabeled data. However, most of them overlook the effect of representations learned by the network and are often less effective when dealing with unlabeled data from the real world, which typically follows a distinct distribution compared to the labeled data.In this paper, we present a probabilistic framework that unifies many recent proposals to cope with long-tail learning problems. Our framework deduces the class-balanced supervised contrastive loss when employing the Gaussian kernel density estimation, and generalizes to unlabeled data based on reliable and smoothed continuous pseudo-labels. We progressively estimate the underlying distribution and optimize its alignment with model predictions to address the agnostic distribution of unlabeled data in the real world.Extensive experiments on several datasets with varying unlabeled data distribution show the advantage of our proposal over previous state-of-the-art methods, e.g., more than 4% improvements on ImageNet-127. The source code is available in the supplementary material.

## Method

<p align = "center">
<img src="assets/model.png" width="90%" />
</p>

## Requirements

- Python 3.7.13
- PyTorch 1.12.0+cu116
- torchvision
- numpy



## Dataset

The directory structure for datasets looks like:
```
datasets
├── cifar-10
├── cifar-100
├── stl-10
├── imagenet32
└── imagenet64
```


## Usage

Train our proposed CCL on CIFAR100-LT of different settings.

For consistent:

```
python train.py --dataset cifar100 --num-max 50 --num-max-u 400 --arch wideresnet --batch-size 64 --lr 0.03 --seed 0 --imb-ratio-label 10 --imb-ratio-unlabel 10  --out out/cifar-100/N50_M400/consistent
```

For uniform:

```
python train.py --dataset cifar100 --num-max 50 --num-max-u 400 --arch wideresnet --batch-size 64 --lr 0.03 --seed 0 --imb-ratio-label 10 --imb-ratio-unlabel 1 --out out/cifar-100/N50_M400/uniform  
```

For reversed:

```
python train.py --dataset cifar100 --num-max 50 --num-max-u 400 --arch wideresnet --batch-size 64 --lr 0.03 --seed 0 --imb-ratio-label 10 --imb-ratio-unlabel 10  --flag-reverse-LT 1 --out out/cifar-100/N50_M400/reversed
```

## Acknowledgement
Our code of CCL is based on the implementation of FixMatch. We thank the authors of the [FixMatch](https://github.com/kekmodel/FixMatch-pytorch) for making their code available to the public.





